CS551, Fall 2003: Implement a Sensor Network

Introduction

In this project, you will implement a sensor network. Your program will take configuration files describing a sensor network and network queries. It will spawn Unix processes each of which simulates a single wireless sensor node. You will implement a CSMA MAC layer and a simple data dissemination protocol called one-phase pull.

You should read about socket programming and familiarize yourself with Unix process management system calls like fork() and exec().

You may implement the project in C or C++.

Please do not steal any code from your classmates' submissions. We will use a sophisticated program checker, and if your software is found to have a significant overlap with another submission, both parties will be given an F for the class.

This project is due midnight of November 30, 2003.

Bootstrapping the Network

As already indicated, you will simulate wireless sensor node using Unix processes. Each simulated node will communicate with other nodes using UDP. In order to do this, however, it needs to know the port numbers of its peers. Your first step will be to bootstrap the network using a god.

You will write the god program, which takes two command line arguments, in this order:

N: the number of sensor nodes in the topology.
dirname: the name of the directory in which all sensor node configuration files (described later) exist, and where all output files reside.

The god program should fork N Unix processes (henceforth known as nodes). Prior to forking, god should bind and listen on an available UDP port. When it forks the nodes, it should communicate that port number and the dirname (this is required below), to each node by some means (this is left for you to figure out). Each node, when it starts up, opens an available port for listening to UDP messages on. This port will be used for node-to-node communication. Before this can happen, however, the node must tell god its port number. To do this, the node sends a UDP message to god containing its port number. Once all nodes have done this, god assigns a node identifier to each node (node identifiers are assigned a number between 1 and N). Then, it sends the list of <node identifier, port number> pairs to every node.

In addition, god should print, to a file named god.out, a list of <node identifier, port number> pairs, one pair on each line, separated by a single blank space. This file should be written to the directory specified in the dirname parameter.

At the end of this phase, then, each node knows its own identifier, and it knows the port numbers of all other nodes. It also knows the directory where the node configuration files exist.

Reading the Configuration File

Now, each node can read and process its configuration file, then originate sensor network queries as specified in that file.

At the end of Step 1, each node knows its own identifier, and the node identifiers and port numbers of all other nodes. Once it knows this information, each node reads its configuration file, M.cfg, where M is its identifier. Recall that M is a number between 1 and N, both inclusive. Thus, a node with identifier 2 reads its configuration from 2.cfg. All configuration files reside in the directory specified as an argument to god, and that were passed to the node during Step 1.

The configuration file for a router is as follows:

Exactly one line of the form location x y which specifies the x and y coordinates of the node. You can assume that the coordinates are integers. You can also assume that our sensor network is stationary (i.e., node locations don't change with time). Assume further that the units of the coordinate system is in meters.
Each sensor node has a temperature sensor. The value or value range of this sensor is specified by exactly one line of the form temperature lo hi, which specifies either a single temperature value (if hi, which is optional, is omitted) or a range of values (if both lo and hi are specified). You may assume that both lo and hi are integers. If only lo is specified, the sensor node's temperature reading is fixed. If a range is specified, then the sensor node randomly picks a temperature value between lo and hi every second (i.e., the temperature value changes every 1 second randomly to a value within the specified range).
Optionally (changed) one or more lines of the form query T type [x1,y1,x2,y2]. This line specifies that the node should send a query (described below) at time T seconds after it (the node) starts up. Queries can be of two types: min and max. The last parameter set is optional and describes the coordinates of two opposing corners of a rectangle; if this parameter is specified, the query should only be executed by nodes within the specified rectangle. Thus, a query of the form query 10 min [10,10,30,30] returns the minimum temperature among nodes within the rectangular region whose corners are [10,10] and [30,30].

Here is a sample configuration file: Changed

location 20 25
temperature 50 70
query 5 max 
query 10 min [5,15,30,45]

After reading the configuration file, each node executes the queries specified and then waits for the system to be terminated. We discuss the implementation details and the output format in the next two sections.

Implementation Details

There are several pieces to the implementation of the node functionality. It is convenient to think of these as separate modules/classes and implement them that way. Although we do not enforce a particular implementation, you should carefully define the right modules/classes, implement them in separate files, define the right interfaces and so on. That's the only way you will learn anything from this project.

In the sections below, we discuss the various pieces to the implementation.

Distributing Location Information

After reading the configuration file, each node knows its own geographic location (i.e., its x and y coordinates). Recall that from god, the node also learned the IDs and port numbers for all nodes in the network. Before each node starts executing the queries, it will need to send its coordinates to every other node. This step is needed to simulate the MAC layer, as we discuss below.

The node should do this by sending a UDP message containing the coordinates to all other nodes. The format of the messages is left up to you.

The Radio Layer

Each node has a radio whose range is 20 meters (all nodes within a 20m radius around node A can hear transmissions from A). Furthermore, each node also has a carrier-sense range of 40m (all nodes within 40m of a node A can detect that A is transmitting something, but may not be able to decipher the message). Finally, the radio bandwidth is 19.2Kbps.

Now, using the location information of the other nodes (obtained from the previous step), each node can compute its neighbors (those that can receive messages from it) and its carrier-sense neighbors (those that can hear transmissions from it, but not receive messages from it). These neighbor sets don't change, so a node can compute them once and remember them (for they will be used in implementing the MAC layer). Changed.

Implementing the MAC layer

The MAC will provide two primitives: a broadcast (a message that reaches all neighbors) and a unicast (a message that reaches only specified neighbors).

To implement the broadcast primitive, each node will need to send the message to all its neighbors. You may implement a unicast as a broadcast message with the destination address in the header; for unicasts, nodes other than the destination simply drop the message.

However, that alone is not enough. You will need to implement a simple CSMA MAC: when a node is about to transmit, but detects that a transmission is in progress, it backs off for a randomly specified time and re-tries. CSMA MACs are described in textbooks, so we don't elaborate on the details here; you should certainly read up on them. However, we note here that in order to implement this, each node will not only need to send a message to its neighbors but its carrier-sense neighbors as well (so that those neighbors can know a transmission is in progress, and back off if they are about to transmit). Furthermore, a node that receives a message should determine if it is a neighbor or a carrier-sense neighbor of the sender, and act accordingly. Implement the CSMA MAC after putting some careful thought to understanding how it works, since the implementation can be somewhat tricky.

You will need to decide on a header format for the MAC layer frame. We do not specify this.

Data Dissemination: One-Phase Pull

Now you are ready to implement the one-phase pull variant of Directed Diffusion.

The basic idea behind one-phase pull is as follows. Any node that is asked to originate a query will flood the query throughout the network. You will need to implement a simple flooding protocol; the query originator broadcasts the packet, and each other node that receives the packet re-broadcasts it exactly once.

Now, as the query is flooded, each node also builds up a routing table for responses. This routing table contains a description of the query and the pointer to the neighbor from which the query was first heard (call this the upstream neighbor for the query).

Each node then evaluates the query, and decides if it should respond. For example, if the node is not in the region specified by the query, it should not respond. If the node chooses to respond, it should prepare a response message containing its current temperature reading (remember that a node's temperature reading may change with time) and send it to its upstream neighbor.

Each node may also receive responses from its downstream neighbors. The node should only forward those responses to its upstream neighbor that are relevant to the query. We discuss what relevant means with a simple example. Suppose the query is of type max. The node needs to remember what temperature value it last transmitted to its upstream neighbor; let that temperature value be T. If the node receives a value less than T from its downstream neighbors, it should not forward that value upstream. This is an example of the kind of in-network processing that distinguishes sensor networks from others.

The design of one-phase pull is a bit tricky, so don't write a line of code without having worked out the design of the software in some detail (i.e., you should first write down all the interfaces between modules, all the functions you are going to implement, define precisely the data structures you are going to use etc.). As before, you are free to design the packet formats in any way you please. Here are some things you should be especially careful about:

What information needs to go into the query packet?
What information needs to go into the response packet?
What, precisely, should the routing table contain?
Recall that there may be multiple simultaneous queries in the network. How are you going to distinguish between those queries?
The originator of a query might get several responses (why?). Also remember that the originator of the query might itself have a temperature sensor. Changed.

Output Format

Each node should print out a trace of routing activity. This trace goes into a file M.out, where M is the identifier of the node and the file resides in the directory specified as an argument to god. The following is the syntax of the output file:

When the MAC layer of a node sends a broadcast, it should print out a line of the form mac-sent T K where K is the identifier of the destination if the packet was unicast, and zero if the packet was a broadcast, and T is the time in milliseconds relative to the start time of the node at which the packet was sent.
When the MAC layer of a node receives a message, it should print out a line of the form mac-received T K C, where T and K have the same meanings as above and C has a value Y if the receiver is a carrier-sense neighbor of the sender and a value N otherwise. Changed
When a node receives a flooded query message for the first time, it should print out a message of the form query U T type [X1,Y1,X2,Y2], where U is the identifier of the upstream neighbor, T is the time in milliseconds relative to the start time of the node at which the packet was received (Changed), the type is either max or min and the coordinates represent the coordinates of the box if specified in the query.
When a node sends or forwards a response to its upstream neighbor, it should print out a message of the form response U V T type [X1,Y1,X2,Y2] where V is the temperature value sent and the other fields are as described above.

Your program should continue to run indefinitely, and only terminate when the user sends a QUIT signal (Control-C) to the god process. At this point, god and each node must exit gracefully. God must print a line shutting down to its output file before exiting. When a node exits as a result of the QUIT signal it must print to the file M.out (where M is the node's identifier):

For each query that the node was asked to originate, a line of the form query-answer A T type [X1,Y1,X2,Y2] where A is the answer to the query, and all the other fields are as described above. You may assume that a node's configuration file will not contain two identical queries.

Changed

File Layout

I want you to learn some basic concepts about Makefile and compiling programs written using multiple files. So, your project must have the following:

One header file for each component of your system. This header file should contain the data structure or class definitions for that component. For example, the MAC layer packet header struct might be defined in a file called mac.h.
Each component should be implemented in a separate C/C++ file.
Makefile: Now that you have have so many files you will need to create a makefile so that you can do all the compilations by just a word ``make''. Your makefile should take care of cleaning up the directory also, i.e. ``make clean'' should remove all the old object files and all the binary files. Read the make(1) man page.

Submitting Your Project

You should submit the project using submit command. The exact syntax of the command is the following. You have to submit all the files i.e., C/C++ files, header files, and the makefile.

% submit -user csci551 -tag proj2 file1 [ file2 ... ]

You are not required to submit any written material.

Evaluating Your Project

Here is how we will evaluate your submission. First, we will copy all the files you have submitted into an empty directory, and then type the following command:

% make

You should structure your makefile so that the make produces an executable called god. We will then run god, as specified above, using configuration files that we create. We will not tell you in advance what nodes description we intend to use. You may, however, assume that the topology description will be syntactically correct.

After running the program for a few minutes, we will send a QUIT signal to god, and then grade your project based on the outputs observed in the output files.

Test Inputs

Here are some test inputs for your program. Please be sure to test your program with your own test inputs as well. Don't just rely on these two sample input sets!

Suggested Schedule

To help you complete your project successfully, here is a suggested schedule. This is a somewhat aggressive schedule, so don't panic if you fall behind, but try to keep close to it. I'd recommend debugging your code piecemeal; don't wait to write all the code before debugging it. If you have working parts, it'll be easier to give you significant credit. As well, it'll be easier to debug the later sections of code.

Before Nov 7th, you should have completed a high-level design of your project, and have read the associated documentation (e.g., whatever you need to know of socket programming).

By Nov. 12th, you should have finished coding the bootstrapping section of the code, and by Nov 15th have tested out the ability to create a number of processes and be able to test passing information between the god process and the ``nodes''.

By Nov 21st, you should have a working MAC layer, and by Nov 26th a working one-phase pull routing implementation. Use the remaining time to perfect your project.

Ramesh Govindan
ramesh@usc.edu
November 25, 2003